-
Notifications
You must be signed in to change notification settings - Fork 78
Writing MatlabOpaque objects to MAT-files #208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
CC @matthijscox Added some methods to write subsystem data... WIP. Basically just translating Python code into Julia here, so would need a review. I had to make some small adjustments to incorporate the write logic. I updated some metadata fields from I don't think performance would take a hit when splatting these finally as we don't expect too many objects in a MAT-file (after splat the total FileWrapper metadata length could be around thousands at best for really complex class definitions). There might be a better way to code the whole logic but until then this should work. I also added a new object cache for saving that would key by A side note, I was able to figure out what the |
|
Yeah this whole class system is beyond me :) I do notice there are zero tests for the new code. I guess you cannot create a minimal write test yet? |
|
Code's not complete yet,. Still working on it, trying to squeeze in a little when I'm free. No point in testing internals as the write tests would take care of it anyways |
|
Most of the code's ready. It's not working yet I'll debug soon and in the meantime add some sample tests |
src/MAT_subsys.jl
Outdated
| push!(fwrap_data, Any[]) | ||
| append!(fwrap_data, subsys.prop_vals_saved) | ||
|
|
||
| empty_struct = MatlabStructArray([]) #! FIXME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthijscox can you help me create a 1x0 empty struct here? I'm struggling with the syntax a bit.
The fields _c3 and prop_vals_defaults should contain a Nx1 cell array, and each cell contains a 1x0 empty struct with no fields. Here N = no. of unique classes in the file plus one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my phone right now, but can you try something like MatlabStructArray(String[], (1,0))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading the code, this might not yet be possible. I always estimate the dimensions from the first column, but if there are no columns... Might be we have to explicitly add the dimensions to the MatlabStructArray type, so that it also works without any columns.
Does the empty struct array have any special features inside the HDF5 format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far I've been able to perform RW operations between MAT.jl and matio with the current code, but unfortunately MATLAB doesn't seem to be able to load files saved by MAT.jl.
It's going to take a while to pinpoint the issue as there are a few differences in the final HDF5 file structure between MAT.jl and MATLAB. At this point I cannot say if its because of the issue with empty structs. Let the MatlabStructArray type be as is for now, until I can identify the problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some experimenting, I believe the issue is because of the struct field names. It looks like if the field name attribute exists, then MATLAB tries to access elements of that field during load, regardless of the struct being empty itself (this is only for object deserialization in subsystem where some kind of assignment operations are happening in the background, not regular struct loading)
I don't think there's any use case for empty structs from a user perspective, so there's no need to update the definition of MatlabStructArray itself. Instead, we can use some other kind of marker that we can detect and specifically handle the empty struct case with no fields. Can you push a fix for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we might need to support something similar for MatlabOpaque type with empty dimensions like 1x0 0x0 or 0x1. Is this possible currently? I've seen these kinds of objects in .fig files IIRC. It's not required for normal use, but MATLAB uses these placeholders in special types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want me to define a special EmptyStructArray type? Or also to create the write function for it? I don't know the HDF5 specs for such a type... is there an easy way to load an example fully empty (no columns, 1x0 matrices) struct array type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
easy way to load an example fully empty (no columns, 1x0 matrices) struct array type
Sorry, yeah that's a good point. I'll check this out first
* subsystem shared metadata fields use these 0x0 structs
|
|
||
| # Write a struct from arrays of keys and values | ||
| function m_write(mfile::MatlabHDF5File, parent::HDF5Parent, name::String, k::Vector{String}, v::Vector) | ||
| if length(k) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the ability to write empty dicts as 0x0 structs over here. The subsystem data works with 0x0 structs as well in MATLAB.
This eliminates the need for any new struct markers for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright. Are 0x0 structs also read as empty dicts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, all of 1x0, 0x1 and 0x0 are read as empty dicts already. But I hardcoded 0x0 for empty dict during write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this might go against the point raised in #214 where an empty dict is being written as a 1x1 struct with no fields. Instead I guess we could either:
- Write empty dicts as 1x1 structs with no fields. For 1x0, 0x0 dims we could maybe use a special key
__dims__that contains a tuple with the dimensions. I think its very unlikely to require reading 1x0, 0x0 structs (without fields), we just need write functionality - Create a new marker for empty structs with no fields
- Update MatlabStructArray to handle empty structs
What do you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a rather special corner case. I'm okay with a new type that we keep internal for the writing, and then we can always consider to refactor later:
struct EmptyStruct
dims::Vector{Int}
end|
It appears to be working now. I tried a minimal test below and it can be read in all of sample_obj = MatlabOpaque(Dict(
"a" => reshape([10.0], 1, 1),
"b" => reshape([1.0, 2.0, 3.0], 3, 1)
), "TestClass")
sample_dict = Dict(
"var" => sample_obj,
)
MAT.matwrite("test.mat", sample_dict)I'll now focus on some simple tests covering user-defined objects and handle class objects. I still need to add support for object arrays. If I've missed anything else let me know |
The objects being written are loading successfully in MATLAB (2023b). With this I think the base for writing objects is covered. We can now write methods for converting from say There's a special case of zero-dimensional objects, but again its an edge case that can be tackled if required. I also opened #213 to fix #212 , it would be easier to merge that first and then fix conflicts here. |
|
Do the tests now also cover the empty dict read/write case? If not, it would be good to add a test for that, so we don't accidentally break it again in the future. |
|
Good point. I added a write test for an empty dict. It is loaded back as an empty dict, which should be a 1x1 struct with no fields in MATLAB. |
| @test read_var["var1"] === read_var["var2"] # same object | ||
| end | ||
| end | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see tests for the special FileWrapper cases? Is that intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one do you mean?
| write_dataset(cset, ctype, refs) | ||
| write_attribute(cset, name_type_attr_matlab, "cell") | ||
| if object_decode == UInt32(3) | ||
| write_attribute(cset, object_decode_attr_matlab, object_decode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this (FileWrapper) case tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this code even be reached by users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part 1 addressed in the other comment. As to your second question, its not reachable by users. The FileWrapper object is some kind of special container used by MATLAB which stores data in a Cell array. This is just a simple workaround to write a cell array for MatlabOpaque types whose data is a cell array.
As far as I know, no other MatlabOpaque types use cell arrays to store (meta)data, so it will only execute for FileWrapper.
| function m_write(mfile::MatlabHDF5File, parent::HDF5Parent, name::String, obj::MatlabOpaque) | ||
| error("writing of MatlabOpaque types is not yet supported") | ||
| if obj.class == "FileWrapper__" | ||
| m_write(mfile, parent, name, obj["__filewrapper__"], UInt32(3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this FileWrapper case tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's implicitly tested with the object tests. All object data + metadata is stored within this FileWrapper__ object in a MAT-file, so it is not possible to load an object without having the FileWrapper object written correctly to a MAT-file.
|
Alright, everything looks okay to me. Do you have the power to merge already (since you are a contributor now) or do I have to make you an official maintainer for that? |
|
I can't merge yet |
|
I'm only a member of JuliaIO I see, so cannot add you as maintainer unfortunately. I'll just have to merge PRs I suppose |
|
No issues, you can also close #214 |
Part of a series of commits to support write operations for
MatlabOpaqueobjects. For now I just rearranged the methods to make it easy for me to update the code.I'll first add to
MAT_subsys.jlbefore updatingMAT_HDFandMAT_v5.